Abstract

The Bloom filter (BF) is a well-known space-efficient data structure that answers set membership queries with some 
probability of false positives. In an attempt to solve many of the limitations of current inter-networking architectures, 
some recent proposals rely on including small BFs in packet headers for routing, security, accountability or other 
purposes that move application states into the packets themselves. In this paper, we consider the design of such
in-packet Bloom filters (iBF). Our main contributions are exploring the design space and the evaluation of a series of
extensions (1) to increase the practicality and performance of iBFs, (2) to enable false-negative-free element deletion, 
and (3) to provide security enhancements. In addition to the theoretical estimates, extensive simulations of the multiple 
design parameters and implementation alternatives validate the usefulness of the extensions, providing for enhanced 
and novel iBF networking applications. 
1. Introduction

Since the seminal survey work by [1], Bloom filters (BF) [2] have increasingly
become a fundamental data aggregation component to address performance and
scalability issues of very diverse network applications, including overlay
networks [3], data-centric routing [4], traffic monitoring, and so on. In this
work, we focus on the subset of distributed networking applications based on
packet-header-size Bloom filters to share some state (information set) among
network nodes. The specific state carried in the Bloom filter varies from
application to application, ranging from secure credentials [5, 6] to IP
prefixes [7] and link identifiers [8], with the shared requirement of a
fixed-size packet header data structure to efficiently verify set memberships.

Considering the constraints faced by the implementation of next generation
networks (e.g., Gbps speeds, increasingly complex tasks, larger systems,
high-speed memory availability, etc.), recent inter-networking proposals
[5, 6, 7, 8, 9, 10] chose to include more information in the packet headers to
keep pace with the increasing speed and needs of Internet-scale systems. Moving
state to the packets themselves helps to alleviate system bottlenecks (e.g., IP
multicast [7], source routing overhead [8]) and enables new in-network
applications (e.g., security [5], traceback [9]) or stateless protocol
designs [11].

We refer to the BF used in this type of applications as 
an in-packet Bloom filter (iBF). In a way, iBFs follow a 
reverse approach compared to traditional standalone BF 
implementations: iBFs can be issued, queried, and mod- 
ified by multiple network entities at packet processing 
time. These specific needs may benefit from additional 
capabilities like element removals or security enhance- 
ments. Moreover, careful BF design considerations are 
required to deal with the potential effects of false pos- 
itives, as every packet header bit counts and the actual 
performance of the distributed system is a key goal. 

In this paper, we address common limitations of naive iBF designs and provide a
practical foundation for networking application designs that need to solve
set-membership problems on a per-packet basis (§3). Our main contribution
consists of assembling and evaluating a series of practical extensions (i) to
increase the system performance, (ii) to enable false-negative-free element
deletion, and (iii) to provide security-enhanced constructs at wire speed (§4).
Via extensive simulation work, we explore the rich design space and provide a
thorough evaluation of the observed trade-offs (§5). Finally, we relate our
contributions to previous work on Bloom filter designs and briefly discuss the
applicability of the iBF extensions to existing applications (§6).



2. Networking applications 

iBFs are well suited for applications where one might 
like to include a list of elements in every packet, but a 
complete list requires too much space. In these situa- 
tions, a hash-based lossy representation, like a BF, can 
dramatically reduce space, maintaining a fixed header 
size, at the cost of introducing false positives when 
answering set-membership queries. From its original 
higher layer applications such as dictionaries, BFs have 
spanned their application domain down to hardware im- 
plementations, becoming a daily aid in network applica- 
tions (e.g., routing table lookups, DPI, etc.) and future 
information-oriented networking proposals [12]. As a 
motivation to our work and to get some practical exam- 
ples of iBF usages, we first briefly survey a series of 
networking applications with the common theme of us- 
ing small BFs carried in packets. 



2.1. Data path security 

The credential-based data path architecture introduced in [5] proposes novel
network router security features. During the connection establishment phase,
routers authorize a new traffic flow request and issue a set of credentials
(aka capabilities) compactly represented as iBF bit positions. The flow
initiator constructs the credentials by including all the router signatures
into an iBF. Each router along the path checks on packet arrival for presence
of its credentials, i.e., the k bits resulting from hashing the packet 5-tuple
IP flow identifier and the router's (secret) identity. Hence, unauthorized
traffic and flow security violations can be probabilistically avoided per hop
in a stateless fashion. Using only 128 bits, the iBF-based authorization token
reduces the probability that attack traffic reaches its destination to a
fraction of a percent for typical Internet path lengths.

2.2. Wireless sensor networks 

A typical attack by compromised sensor nodes consists of injecting large
quantities of bogus sensing reports, which, if undetected, are forwarded to the
data sink. The approach in [6] proposes a detection method based on an iBF
representation of the report generation (collection of keyed message
authentications), which is verified probabilistically en route so that
incorrect reports are dropped. The iBF-based solution uses only 64 bits and is
able to filter out 70% of the injected bogus reports within 5 hops, and up to
90% within 10 hops along the paths to the data sink.

2.3. IP traceback 

The packet-marking IP traceback method proposed in [9] relies on iBFs to trace
an attack back to its approximate source by analyzing a single packet. On
packet arrival, routers insert their mark (IP mask) into the iBF, enabling a
receiver to probabilistically reconstruct the packet path(s) by testing for iBF
presence of neighboring router addresses.

2.4. Loop prevention 

In Icarus [10], a small iBF is initialized with 0s and gets filled as
forwarding elements add (OR) the Bloom masks (size m, k bits set to 1) of the
interfaces they pass to the iBF. If the OR operation does not change the iBF,
then the packet might be looping and should be dropped. If the Bloom filter
changes, the packet is definitely not looping.
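
The loop check described above can be sketched in a few lines. This is only an illustration of the OR-and-compare idea, not the Icarus implementation; the function name `forward` and the 16-bit masks are hypothetical.

```python
from typing import Tuple

def forward(ibf: int, mask: int) -> Tuple[int, bool]:
    """OR an interface's Bloom mask into the iBF.

    Returns the updated iBF and a flag that is True when the OR changed
    nothing, i.e. the packet *might* be looping (false positives possible).
    """
    updated = ibf | mask
    return updated, updated == ibf

# Hypothetical 16-bit Bloom masks with k = 2 bits set each.
mask_a = 0b0000000000010100
mask_b = 0b0100000000000010

ibf = 0                                     # iBF initialized with 0s
ibf, maybe_loop_1 = forward(ibf, mask_a)    # first pass over interface A
ibf, maybe_loop_2 = forward(ibf, mask_b)    # first pass over interface B
ibf, maybe_loop_3 = forward(ibf, mask_a)    # interface A seen again
```

Only the third call leaves the iBF unchanged, which is the signal a forwarding element would use to drop the packet.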

2.5. IP multicast 

Revisiting the case of IP multicast, the authors of [7]
propose inserting an iBF above the IP header to represent
domain-level paths of multicast packets. After
discovering the dissemination tree of a specific multi- 
cast group, the source border router searches its inter- 
domain routing table to find the prefixes of the group 
members. It then builds an 800-bit shim header by inserting
the path labels (AS_a : AS_b) of the dissemination
tree into the iBF. Forwarding routers receiving the iBF 
check for presence of next hop autonomous systems and 
forward the packet accordingly. 

2.6. Source routing & multicast 

The LIPSIN [8] forwarding fabric leverages the idea of having interface
identifiers in BF-form (m-bit Link ID with only k bits set to 1). A routing iBF
can then be constructed by ORing the different Link IDs representing a source
route. Forwarding nodes maintain a small Link ID table whose entries are
checked for presence in the routing iBF to take the forwarding decision. In
addition to stateless multicast capabilities, a series of extensions are
proposed to have a practical and scalable networking approach (e.g., virtual
links, fast recovery, loop detection, false positive blocking, IP
inter-networking, etc.). Using a 256-bit iBF in a typical WAN topology,
multicast trees containing around 40 links can be constructed to reach in a
stateless fashion up to 24 users while maintaining the false positive rate
(≈ 3%) and the associated forwarding efficiency within reasonable performance
levels.

3. Basic design 

The basic notation of an iBF is equivalent to the stan- 
dard BF, that is an array of length m, k independent hash 
functions, and number of inserted elements n. For the 
sake of generality, we refer simply to elements as the 
objects identified by the iBF and queried by the network 
processing entities. Depending on the specific iBF net- 
working application, elements may take different forms 
such as interface names, IP addresses, certificates, and 
so on. 

On insertion, the element is hashed to the k hash values
and the corresponding bit positions are set to 1. On
element check, if any of the bits determined by the hash 
outputs is 0, we can be sure that the element was not 
inserted (no false negative property). If all the k bits 
are set to 1, we have a probabilistic argument to be- 
lieve that the element was actually inserted. The case 
of collisions to bits set by other elements causing a non- 
inserted element to return "true" is referred to as a false 
positive. In different networking applications, false pos- 
itives manifest themselves with different harmful effects 
such as bandwidth waste, security risks, computational 
overhead, etc. Hence, a system design goal is to keep 
false positives to a minimum. 
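
As a minimal sketch of the insertion and check operations just described, the following uses a Python integer as the m-bit array. The hash family (k salted SHA-1 digests), the parameter values and the element names are assumptions made for illustration only.

```python
import hashlib

M, K = 256, 5  # illustrative array length and number of hash functions

def positions(element: bytes, m: int = M, k: int = K) -> list:
    """k bit positions derived from k salted SHA-1 digests (assumed hash family)."""
    return [
        int.from_bytes(hashlib.sha1(bytes([i]) + element).digest(), "big") % m
        for i in range(k)
    ]

def insert(bf: int, element: bytes) -> int:
    """Set the k bit positions of the element to 1."""
    for pos in positions(element):
        bf |= 1 << pos
    return bf

def check(bf: int, element: bytes) -> bool:
    """False means 'definitely not inserted'; True means 'probably inserted'."""
    return all((bf >> pos) & 1 for pos in positions(element))

bf = 0
for name in (b"if-eth0", b"if-eth1", b"if-eth2"):
    bf = insert(bf, name)
```

Every inserted name passes `check`, while a query against an empty filter always fails, which is the no-false-negative property in its simplest form.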

3.1. False positive estimates 

It is well known that the a priori false positive estimate, fpb, is the
expected false positive probability for a given set of parameters (m, n, k)
before element addition:

$$ fpb = \left[ 1 - \left( 1 - \frac{1}{m} \right)^{k n} \right]^{k} \qquad (1) $$

The number k that minimizes the false positive probability can be obtained by
setting the partial derivative of fpb with respect to k to 0. This is attained
when k = ln(2) · m/n, which is rounded to an integer to determine
the optimal number of hash functions to be used [1].
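
For concreteness, the a priori estimate and the optimal number of hash functions can be evaluated numerically. The helper names `fpb` and `optimal_k` and the parameter values are chosen here for illustration; they are not part of the original evaluation.

```python
import math

def fpb(m: int, n: int, k: int) -> float:
    """A priori false positive estimate for array length m, n elements, k hashes."""
    return (1.0 - (1.0 - 1.0 / m) ** (k * n)) ** k

def optimal_k(m: int, n: int) -> int:
    """k = ln(2) * m / n, rounded to an integer (at least 1)."""
    return max(1, round(math.log(2) * m / n))

m, n = 256, 32                 # 8 bits per element
k_opt = optimal_k(m, n)        # ln(2) * 8 is about 5.5, which rounds to 6
estimate = fpb(m, n, k_opt)    # roughly 2% for these parameters
```

Under these assumptions the estimate is nearly flat around the optimum, so choosing k = 5 instead of 6 changes the a priori value only marginally.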

While Eq. 1 has been extensively used and experimentally validated as a good
approximation, for small values of m the actual false positive rate can be much
larger. Recently, Bose et al. [13] have shown that fpb is actually only a lower
bound, and a more accurate estimate can be obtained by formulating the problem
as a balls-into-bins experiment:

which, according to [13, Theorem 4], can be lower- and upper-bounded as follows:

The difference between the observed false positive rate and the theoretical
estimates can be significant for small size BFs, a fact that we (and others)
have empirically observed (see evaluation in §5.1.2). Hence, iBFs are prone to
more false positives than traditional BFs with equivalent m/n ratios but larger
values of m.

Neither Eq. 1 nor Eq. 2 involves knowing exactly how many bits are actually set
to 1. A more accurate estimate can be given once we know the fill factor ρ,
that is, the fraction of bits that are actually set to 1 after all elements are
inserted. We can define the posterior false positive estimate, fpa, as the
expected false positive probability after inserting the elements:

$$ fpa = \rho^{k} \qquad (4) $$



Finally, the observed false positive probability is the ac- 
tual false positive rate (fpr) that is observed when a 
number of queries are made: 



$$ fpr = \frac{\text{Observed false positives}}{\text{Tested elements}} \qquad (5) $$



Note that the fpr is an experimental quantity computed 
via simulation or system measurements and not a theo- 
retical estimate. Hence, the fpr is the key performance 
indicator we want to measure in a real system, where 
every observed false positive will cause some form of 
degradation. Therefore, practitioners are less interested 
in the asymptotic bounds of the hash-based data struc- 
ture and more concerned with the actual false positive 
rates, especially in the space-constrained scenario of 
tiny iBFs. 
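
The relation between the fill factor, the posterior estimate (Eq. 4) and the observed rate (Eq. 5) can be explored with a small simulation. This sketch assumes random footprints with exactly k distinct bits set and a fixed random seed; the function names are hypothetical and the numbers are not those of the paper's evaluation.

```python
import random

def footprint(rng, m, k):
    """Random element footprint: an m-bit integer with k distinct bits set."""
    fp = 0
    for pos in rng.sample(range(m), k):
        fp |= 1 << pos
    return fp

def fill_factor(ibf, m):
    """Fraction of the m bits that are set to 1 (rho)."""
    return bin(ibf).count("1") / m

def observed_fpr(ibf, inserted, probes):
    """Eq. 5: observed false positives over tested (non-inserted) elements."""
    tested = [p for p in probes if p not in inserted]
    false_pos = sum(1 for p in tested if (p & ibf) == p)
    return false_pos / len(tested)

rng = random.Random(7)               # fixed seed for repeatability
m, k, n = 256, 5, 32
inserted = {footprint(rng, m, k) for _ in range(n)}
ibf = 0
for fp in inserted:
    ibf |= fp

rho = fill_factor(ibf, m)
fpa = rho ** k                       # Eq. 4
probes = [footprint(rng, m, k) for _ in range(20000)]
fpr = observed_fpr(ibf, inserted, probes)
```

With enough probes, `fpr` settles close to `fpa`, while both depend on the fill factor actually reached by the inserted set rather than on the a priori parameters alone.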

3.2. Naming and operations 

A nice property of hash-based data structures is that 
the performance is independent from the form of the in- 
serted elements. Independently of its size or represen- 
tation, every element carried in the iBF contributes with 
at most k bits set to 1. In order to meet the line speed
requirements on iBF operations, a design recommendation
is to have the elements readily in a pre-computed
BF-form (m-bit vector with k bits set to 1), avoiding 
thereby any hashing at packet processing time. Ele- 
ment insertion becomes a simple, parallelizable bitwise 
OR operation. Analogously, an iBF element check can 
be performed very efficiently in parallel via fast bitwise 
AND and COMP operations. 

A BF-ready element name, also commonly referred 
to as element footprint, can be stored as an m-bit bit 
vector or, for space efficiency, it can take a sparse representation
including only the indexes of the k bit positions
set to 1. In this case, each element entry requires
only k * log2(m) bits. 
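
A short sketch of the pre-computed footprint idea: insertion is a bitwise OR, the membership check is an AND followed by a comparison, and the sparse form stores only the k indexes. The helper names and the example indexes are illustrative assumptions.

```python
import math

M, K = 256, 5  # illustrative parameters

def to_footprint(indexes):
    """Dense form: m-bit integer with the given bit positions set to 1."""
    fp = 0
    for i in indexes:
        fp |= 1 << i
    return fp

def to_sparse(fp):
    """Sparse form: sorted list of the bit positions set to 1."""
    return [i for i in range(fp.bit_length()) if (fp >> i) & 1]

def insert(ibf, fp):
    return ibf | fp                  # no hashing at packet processing time

def contains(ibf, fp):
    return (ibf & fp) == fp          # AND, then compare with the footprint

# Storage per element in sparse form: k * log2(m) bits.
sparse_bits = K * math.ceil(math.log2(M))   # 5 * 8 = 40 bits instead of 256

link = to_footprint([3, 17, 42, 128, 255])
ibf = insert(0, link)
```

For m = 256 and k = 5 the sparse entry needs 40 bits, which is why it is attractive when a node stores many element names.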

4. Extensions 

In this section, we describe three useful extensions to 
basic Bloom filter designs in order to address the fol- 
lowing practical issues of iBFs: 

(i) Performance: Element Tags exploit the notion of power of choices in
combining hashing-based element names to select the best iBF according to
some criteria, for instance, fewer false positives.

(ii) Deletion: Deletable Regions introduce an additional header to code
collision-free zones, thereby enabling safe (false-negative-free) element
removals at an affordable packet header bit space.

(iii) Security: Secure Constructs use packet-specific information and
distributed time-based secrets to provide protection from iBF replay attacks
and bit pattern analysis, preventing attackers from misusing iBFs or trying
to infer the identities of the inserted elements.

4.1. Element Tags 

The concept of element Tags (eTags) is based on 
extending BF-compatible element naming with a set 
of equivalent footprint candidates. That is, instead of 
each element being identified with a single footprint, 
every element is associated with d alternative names, 
called eTags, uniformly computed by applying some 
system-wide mapping function (e.g., k * d hash func- 
tions). That allows us to construct iBFs that can be op- 
timized in terms of the false positive rate and/or com- 
pliance with element-specific false positive avoidance 
strategies. Hence, for each element, there are d different
eTags, where d is a system parameter that can vary
depending on the application. As we see later, a practical
value of d is in the range of multiples of 2 between
2 and 64.

We use the notion of power of choices [14] and take advantage of the random
distribution of the bits set to 1 to select the iBF representation among the d
candidates that leads to a better performance given a certain optimization goal
(e.g., lower fill factor, avoidance of specific false positives). This way, we
follow a similar approach to the Best-of-N method applied in [15], with the
main differences that (1) we target a distributed application scenario where
the d value is carried in the packet header, and (2) the best candidate
selection criterion is not limited to the least amount of bits set but includes
any system optimization criteria (e.g., §5.2 bit deletability), including those
that involve counting false positives against a training set (e.g., §4.1.2
fpr-based selection).

The caveats of this extension are, first, it requires 
more space to store element names, and second, the 
value d needs to be stored in the packet header as well, 
consuming bits that could be used for the iBF. However, 
knowing the d value at element query time is fundamen- 
tal to avoid checking multiple element representations. 
Upon packet arrival, the iBF and the corresponding eTag 
entries can be ANDed in parallel. 

4.1.1. Generation of eTags 

To achieve a near uniform distribution of 1s in the iBF, k independent hash
functions per eTag are required. In general, k may be different for each eTag,
allowing to adapt better to different fill factors and reducing the false
positives of more sensitive elements. Using the double hashing technique [16]
to compute the bits set to 1 in the d eTags, only two independent hash
functions are required without any increase of the asymptotic false positive
probability. That is, we rely on the result of Kirsch and Mitzenmacher [16] on
linear combination of hash functions, where two independent hash functions
h1(x) and h2(x) can be used to simulate i random hash functions of the form:

$$ g_i(x) = [h_1(x) + i \cdot h_2(x)] \bmod m \qquad (6) $$

As long as h1(x) and h2(x) are system-wide parameters, only sharing i = d * k
integers is required to derive the eTags for any set of elements. For space
efficiency, another optimization for the sparse representation of the
candidates consists of defining the d eTags by combinations among k + x iBF
positions, i.e., d = C(k + x, k), the number of ways to choose k out of k + x
positions.
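
The double hashing construction of Eq. 6 can be sketched as follows, using SHA-1 and MD5 as h1 and h2. How the d * k indexes are split among the d eTags (consecutive blocks of k) is an assumption of this sketch, as is the function name `etags`.

```python
import hashlib

def etags(element: bytes, m: int, k: int, d: int) -> list:
    """Derive d eTags (m-bit integers with at most k bits set) via Eq. 6.

    h1 and h2 are SHA-1 and MD5 digests read as integers. eTag j uses the
    indexes i = j*k ... (j+1)*k - 1, which is an illustrative layout.
    """
    h1 = int.from_bytes(hashlib.sha1(element).digest(), "big")
    h2 = int.from_bytes(hashlib.md5(element).digest(), "big")
    tags = []
    for j in range(d):
        tag = 0
        for i in range(j * k, (j + 1) * k):
            tag |= 1 << ((h1 + i * h2) % m)   # g_i(x) = [h1 + i*h2] mod m
        tags.append(tag)
    return tags

tags = etags(b"link-42", m=256, k=5, d=8)
```

Only two digests are computed per element regardless of d, which is the practical benefit of the linear combination.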

4.1.2. Candidate selection 

Having "equivalent" iBF candidates enables defining selection criteria based on
some design-specific objectives. To address performance by reducing false
positives, we can select the candidate iBF that presents the best posterior
false positive estimate (fpa-based selection; Eq. 4). If a reference test set
is available to count false positives, the iBF choice can be based on the
lowest observed rate (fpr-based selection; Eq. 5). Another type of selection
policy can be specified to favor the candidate presenting fewer false positives
for certain "system-critical" elements (element-avoidance-based selection).
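
The fpa-based selection can be illustrated with a short sketch: build one candidate iBF per eTag index and keep the one with the fewest bits set. The eTag derivation used here (a salted SHA-256 digest) is a stand-in chosen to keep the example self-contained, and all names are hypothetical.

```python
import hashlib

def etag(element: bytes, j: int, m: int = 256, k: int = 5) -> int:
    """Hypothetical eTag j of an element: k positions from a salted SHA-256."""
    digest = hashlib.sha256(bytes([j]) + element).digest()
    tag = 0
    for i in range(k):
        tag |= 1 << (int.from_bytes(digest[2 * i:2 * i + 2], "big") % m)
    return tag

def build_candidates(elements, d, m=256, k=5):
    """Candidate j is the OR of the j-th eTag of every element."""
    candidates = []
    for j in range(d):
        ibf = 0
        for e in elements:
            ibf |= etag(e, j, m, k)
        candidates.append(ibf)
    return candidates

def select_by_fill(candidates):
    """With constant k, the fewest 1s gives the lowest fill factor, hence the lowest fpa."""
    return min(range(len(candidates)), key=lambda j: bin(candidates[j]).count("1"))

elements = [b"link-%d" % i for i in range(20)]
candidates = build_candidates(elements, d=16)
best = select_by_fill(candidates)      # index d carried in the packet header
```

An fpr-based policy would replace `select_by_fill` with a function that counts false positives against a reference test set, at a higher construction cost.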

4.1.3. False positive improvement estimate 

Following the same analysis as in [15], the potential gain in terms of false
positive reduction due to selecting the iBF candidate with fewer 1s can be
obtained by estimating the least number of bits set after d independent random
variable experiments (see Appendix B for the mathematical details). Fig. 1
shows the expected gains when using the fpa-based selection after generating d
candidate iBFs for a given element set. With a few dozen candidates, one can
expect a factor of 2 improvement in the observed fpr when selecting the
candidate with fewer ones. Note that the four iBF configurations plotted in
Fig. 1 have the same m/n ratio. In line with the arguments of [13], smaller bit
vectors start with a slightly larger false positive estimate. However, as shown
in Fig. 1(b), the fpr improvement factor of smaller iBFs due to the d-eTag
extension is larger. Hence, especially for small iBFs, computing d candidates
can highly improve the false positive behavior, a fact that we have validated
experimentally in §5.1.

4.2. Deletable Regions 

Under some circumstances, a desirable property of 
iBFs is to enable element deletions as the iBF packet 
is processed along network nodes. For instance, this 
is the case when some inserted elements are to be pro- 
cessed only once (e.g., a hop within a source route), or, 
when bit space to add more elements is required. Un- 
fortunately, due to its compression nature, bit collisions 
hamper naive element removals unless we can tolerate 
introducing false negatives into the system. To over- 
come this limitation (with high probability), so-called 
counting Bloom filters (CBF) [17] were proposed to ex- 
pand each bit position to a cell of c bits. In a CBF, each 
bit vector cell acts as a counter, increased on element 
insertion and decreased on element removal. As long 
as there is no counter overflow, deletions are safe from 
false negatives. The main caveat is the c times larger 
space requirements, a very high price for the tiny iBFs 
under consideration. 
Figure 1: False positive probability gains of the power of choices extension.



Figure 2: An example of the D1BF with m = 32, k = 3 and r = 4, representing the
set {x, y, z}. The 1s in the first r bits indicate that a collision happened in
the corresponding region and that the bits therein cannot be deleted. Since
each element has at least one bit in a collision-free zone, all of them are
deletable.

The key idea is to keep track of where the collisions occur at element
insertion time. By using the property that bits set to 1 by just one element
(collision-free bits) are safely deletable, the proposed extension consists of
encoding the deletable regions as part of the iBF header. Then, an element can
be effectively removed if at least one of its bits can be deleted. Again, this
extension should consume a minimum of bits from the allocated iBF space. A
straightforward coding scheme is to divide the iBF bit vector into r regions of
m'/r bits each, where m' is the original m minus the extension header bits. As
shown in Fig. 2, this extension uses r bits to code with 0 a collision-free
region and with 1 otherwise (non-deletable region). The probability of element
deletion, i.e., n elements having at least one bit in a collision-free region,
can be approximated by (see Appendix A for the mathematical details):
k 



pdr = 1 - 



r(m - r) 



(7) 



Plotting pdr against the number of regions r (see Fig. 3(a)) confirms the
intuition that increasing r results in a larger proportion of elements being
deletable. As more elements are inserted into the iBF, the number of collisions
increases and the deletion capabilities (bits in collision-free regions) are
reduced (see Fig. 3(b)). As a consequence, the target element deletion
probability pdr and the number of regions r establish a practical limitation on
the capacity n_max of deletable iBFs.

From a performance perspective, enabling deletions 
comes at the cost of r bits from the iBF bit space. How- 
ever, removing already processed elements decreases 
the fill factor and consequently reduces the probability 
of false positives upfront. Later in §5.2 we explore the
trade-offs between the overhead of coding the deletable 
regions, the impact on the fpr, and the implications of 
the candidate selection criteria. 
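
The Deletable Regions mechanism can be sketched as a small class that records, in an r-bit header, which regions saw a collision during insertion. The class name, its methods and the example bit positions are illustrative assumptions; the sizes follow the m = 32, r = 4 example (28 filter bits in four regions of 7 bits).

```python
class DeletableIBF:
    """iBF of m' bits split into r regions, plus an r-bit collision header."""

    def __init__(self, m_prime=28, r=4):
        self.region_size = m_prime // r
        self.bits = 0          # the m'-bit filter
        self.collided = 0      # header bit j is 1 if region j saw a collision

    def insert(self, positions):
        for p in set(positions):
            if (self.bits >> p) & 1:                      # bit already set by another element
                self.collided |= 1 << (p // self.region_size)
            self.bits |= 1 << p

    def delete(self, positions):
        """Clear the element's bits in collision-free regions; True if any was cleared."""
        removed = False
        for p in set(positions):
            if not (self.collided >> (p // self.region_size)) & 1:
                self.bits &= ~(1 << p)
                removed = True
        return removed

    def contains(self, positions):
        return all((self.bits >> p) & 1 for p in positions)

x, y, z = [1, 9, 20], [3, 9, 25], [5, 16, 26]   # hypothetical bit positions
f = DeletableIBF()
for element in (x, y, z):
    f.insert(element)
deleted = f.delete(x)      # bit 9 is shared with y and stays set
```

Deleting x clears only its bits in collision-free regions, so y and z still test positive and no false negative is introduced.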

4.3. Secure Constructs 

The hashing nature of iBFs provides some inherent security properties to
obscure the identities of the inserted elements from an observer or attacker.
However, we have identified a series of cases where improved security means are
desirable. For instance, an attacker is able to infer, with some probability,
whether two packets contain an overlapping set of elements by simply inspecting
the bits set to 1 in the iBFs. In another case, an attacker may wait and
collect a large sample of iBFs to infer some common patterns of the inserted
elements. In any case, if the attacker has knowledge of the complete element
space (and the eTags generation scheme), he/she can certainly try a dictionary
attack by testing for presence of every element and obtain a probabilistic
answer to what elements are carried in a given iBF. A similar problem has been
studied in [18] to secure standalone BFs representing a summary of documents by
using keyed hash functions. Our approach does not differ at the core of the
solution, i.e., obscuring the resulting bit patterns in the filter by using
additional inputs to the hashes. However, our attention is focused on the
specifics of distributed, line-speed iBF operations.

Figure 3: Deletability probability (m=256).

The main idea to improve the security is to bind 
the iBF element insertion to (1) an invariant of the 
packet or flow (e.g., IP 5-tuple, packet payload, etc.), 
and (2) system-wide time-based secret keys. Basi- 
cally, the inserted elements become packet- and time- 
specific. Hence, an iBF gets expirable and meaning- 
ful only if used with the specific packet (or authorized 
packet flow), avoiding the risk of an iBF replay attack, 
where the iBF is placed as a header on a different packet. 

4.3.1. Binding to packet contents 

We strive to provide a lightweight, bit mixing function O = F(K, I) to make an
element name K dependent on additional in-packet information I. For this
extension, an element name K is an m-bit hash output and not the eTag
representation with only k bits set to 1. The function F should be fast enough
to be computed at packet processing time over the complete set of elements to
be queried by a node processing the iBF. The output O is the k bit positions to
be set/checked in the iBF. Using true hash functions (e.g., MD5, SHA1) as F
becomes impractical if we want to avoid multiple (one per element)
cycle-intense hashing operations per packet.

Algorithm 1: Secure iBF element set/check algorithm.

Input: element name K, packet-specific id. I, param. d
Output: k bit positions to be set/checked

1. Let O = K ⊕ I
2. Divide O into k segments of m/k bits: O_1, O_2, ..., O_k
3. Divide each O_j into a c × log2(m) bit matrix: O_j1, O_j2, ..., O_jc,
   where c = ⌈m / (k · log2(m))⌉
4. foreach O_j ∈ [O_1, O_k] do
       Set/check bit position i in the iBF where:
           i = O_j1 ⊕ O_j2 ⊕ ... ⊕ O_jc
           i << d
   end

As an example of a resource-efficient implementation of F, we propose the
lightweight Algorithm 1 to mix each element key K with a fixed bit string I.
Taking I as an input, the algorithm runs in parallel on each element K and
returns the k bit positions in the iBF to be set or checked. After an initial
bitwise XOR operation (Step 1), the output O is divided into k segments of m/k
bits (Step 2). To build the folding matrix in Step 3, each segment is
transformed into a matrix of c × log2(m) bits.¹ For instance, with m = 256 and
k = 4, each segment O_j would be a 64-bit bit vector transformed into an 8×8
matrix. Finally, each of the k output values is computed by XORing the rows of
each matrix into a log2(m)-bit value that returns the bit position to be
set/checked (Step 4). The last d-bit shifting operation enables the power of
choices capabilities.
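
The steps of Algorithm 1 can be sketched in Python as below. Two points are assumptions of this sketch rather than statements of the original design: the final d-bit shift is implemented as a rotation within log2(m) bits so that the result stays a valid bit position, and incomplete matrix rows are implicitly zero-padded. The function name `secure_positions` is hypothetical and m is assumed to be a power of two.

```python
import math

def secure_positions(K: int, I: int, m: int = 256, k: int = 4, d: int = 0) -> list:
    """Return k bit positions for element name K bound to packet identifier I."""
    w = m.bit_length() - 1                 # log2(m) bits per position (m power of two)
    seg_bits = m // k                      # bits per segment
    c = math.ceil(seg_bits / w)            # rows of each folding matrix
    row_mask = (1 << w) - 1
    O = (K ^ I) & ((1 << m) - 1)           # Step 1: bitwise XOR
    out = []
    for j in range(k):                     # Step 2: k segments of m/k bits
        seg = (O >> (j * seg_bits)) & ((1 << seg_bits) - 1)
        pos = 0
        for row in range(c):               # Steps 3-4: XOR the c rows together
            pos ^= (seg >> (row * w)) & row_mask
        s = d % w                          # assumed: rotate left by d within w bits
        pos = ((pos << s) | (pos >> (w - s))) & row_mask
        out.append(pos)
    return out

same_packet = secure_positions(0xFF, 0x00)      # only the first segment is non-zero
other_packet = secure_positions(0xFF, 0x0F)     # different I gives different positions
shifted = secure_positions(0x01, 0x00, d=1)     # power-of-choices variant
```

With these inputs the same element name maps to different bit positions for different packet identifiers, which is the binding property the extension is after.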

We are faced with the classic trade-off between security and performance. A
heuristic evaluation suggests that the proposed F provides a good balance
between performance and security. First, F involves only bit shifting and XOR
operations that can be done in a few clock cycles in parallel for every K.
Second, the k bit positions depend on all the bits, within an m/k bit segment,
from the inputs I and K. The security of F depends on how well I and K are
mixed. For security-sensitive applications, the XOR operation in Step 1 should
be replaced with a more secure transformation P(K, I), e.g., using a
lightweight cipher or hash function. In general, F should take the application
specifics into account (e.g., nature of K, computation of I per packet) and the
target security level.

1 Note that depending on the values of m and k, some padding bits (e.g.,
re-used from within the segment) may be required to complete the matrix.

4.3.2. Time-based keyed hashing 

A more elaborate security extension consists of using a keyed element name
construction and changing the secret key S(t) regularly. We can define S(t) as
the output of a pseudo-random function S_i = F(seed, t_i), where seed is the
previous seed value and t_i a time-based input. Then, we can include the
current S value in the algorithm for element check/insertion, e.g.,
O = F(K, I, S(t)). Thereby, we have a periodically updated, shared secret
between iBF issuers and iBF processing entities, with the benefit that an iBF
cannot be re-utilized after a certain period of time or after an explicit
re-keying request. Moreover, by accepting S_i and S_{i-1}, the system requires
only loose synchronization, similar to commercial time-coupled token
generators. At the cost of initial synchronization efforts and computational
overhead, this method provides an effective countermeasure to protect the
system from compromised iBF attacks (cf. DDoS protection with self-routing
capabilities [19]).
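
A possible shape of the time-based keying is sketched below with HMAC-SHA256 as the pseudo-random function. For simplicity the key of each epoch is derived from a fixed shared seed and an epoch counter instead of chaining the previous seed, and the element check hashes per packet instead of using the lightweight F; both are simplifications of this sketch, and all names are hypothetical.

```python
import hashlib
import hmac

def epoch_key(seed: bytes, epoch: int) -> bytes:
    """S_i = F(seed, t_i), here HMAC-SHA256 over an epoch counter (assumed PRF)."""
    return hmac.new(seed, epoch.to_bytes(8, "big"), hashlib.sha256).digest()

def keyed_positions(key: bytes, element: bytes, packet_id: bytes, m=256, k=5):
    """k bit positions that depend on the element, the packet and the epoch key."""
    digest = hmac.new(key, element + packet_id, hashlib.sha256).digest()
    return [int.from_bytes(digest[2 * i:2 * i + 2], "big") % m for i in range(k)]

def make_ibf(positions) -> int:
    ibf = 0
    for p in positions:
        ibf |= 1 << p
    return ibf

def verify(ibf: int, seed: bytes, now_epoch: int, element: bytes, packet_id: bytes) -> bool:
    """Accept footprints built with S_i or S_{i-1} (loose synchronization)."""
    for epoch in (now_epoch, now_epoch - 1):
        pos = keyed_positions(epoch_key(seed, epoch), element, packet_id)
        if all((ibf >> p) & 1 for p in pos):
            return True
    return False

seed = b"shared-seed"
ibf = make_ibf(keyed_positions(epoch_key(seed, 100), b"link-A", b"flow-1"))
```

An iBF issued in epoch 100 is accepted in epochs 100 and 101, and is rejected (up to false positives) afterwards or when attached to a different flow.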

4.4. Density factor 

Another iBF security measure, also proposed in [5], is to limit the percentage
of 1s in the iBF to 50-75%. A density factor ρ_max can safely be set to
k · n_max / m, as each legitimate element contributes at most k bits. Then,
the probability of an attacker guessing a bit combination causing a single
false positive can be upper bounded by ρ_max^k.
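
The density check amounts to a population count compared against a threshold. The following sketch uses hypothetical helper names and illustrative parameters (m = 256, k = 5, n_max = 32).

```python
def rho_max(k: int, n_max: int, m: int) -> float:
    """Maximum legitimate density: each element sets at most k bits."""
    return min(1.0, k * n_max / m)

def within_density(ibf: int, k: int, n_max: int, m: int) -> bool:
    """Reject iBFs with more 1s than n_max legitimate elements could produce."""
    return bin(ibf).count("1") / m <= rho_max(k, n_max, m)

def guess_bound(k: int, n_max: int, m: int) -> float:
    """Upper bound on a random guess yielding one false positive: rho_max ** k."""
    return rho_max(k, n_max, m) ** k

limit = rho_max(5, 32, 256)                       # 160 / 256 = 0.625
bound = guess_bound(5, 32, 256)                   # 0.625 ** 5, about 9.5%
ok = within_density((1 << 100) - 1, 5, 32, 256)   # 100 of 256 bits set
flooded = within_density((1 << 256) - 1, 5, 32, 256)
```

An all-ones iBF, which would otherwise match every element, is rejected by this check.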

5. Practical evaluation 

In this section, we turn our attention to the practical behavior of the iBF as
a function of the multiple design parameters and carry out extensive simulation
work to validate the usefulness of the three extensions under consideration.
For these purposes, we use randomly generated bit strings as input elements and
the double hashing technique using SHA1 and MD5. The section concludes by
briefly exploring the potential impact of different types of iBF elements (flat
labels, IP addresses, dictionary entries) and the hash function implementation
choice.

5.1. Element Tags

We are interested in evaluating the gains of the power of choices that
underpins the element Tag extension (§4.1), where any element set can be
equivalently represented by d different iBFs, different in their bit
distribution but equivalent with regard to the carried element identities. We
first explore the case where k = 5 and then the impact of using a distribution
around 5 for candidate naming.

5.1.1. Power of choices (d) 

We run the simulations varying d from 2 to 64 and updating m accordingly to
reflect the overhead of including the value d in the packet header. Fig. 4
compares the observed fpr for different values of d. We see that increasing d
and choosing the candidate iBF just by observing its fill factor after
construction (Fig. 4(a)) leads to better performing iBFs. In the region where
the iBF is more filled (30-40 elements), the observed fpr drops between 30% and
50% when 16 or more candidate iBFs are available. Another interpretation is
that for a maximal target fpr we can now insert more elements. As expected, the
performance gain is more significant if we consider the best performing iBF
candidate after testing for false positives. Observing Fig. 4(b), the number of
false positives is approximately halved when comparing the best iBF among 16 or
more against a standard 256-bit iBF. In general, we note that the observed fpr
is slightly larger than the commonly assumed theoretical estimate (Eq. 1), thus
confirming the findings by [13] (Eq. 2). This difference is more noticeable for
small values of m and becomes negligible for values larger than 1024 (see
Table 1).

5.7.2. Distribution of the number of hash functions (k) 
Now, we allow a different number of bits k per candidate.
For instance, with d = 8 the distribution of k among the
candidates is [4,4,5,5,6,6,7,7]. Intuitively, this naming
scheme adapts better to the total number of elements in
the iBF (k closer to $k_{opt} = \ln(2) \cdot m/n$). The
fpa-based selection criterion (§ 4.1) now chooses the
candidate with the lowest estimate after hashing:
$\min\{\rho_0^{k_0}, \ldots, \rho_d^{k_d}\}$. Fig. 5(a)
shows the distribution of the selected 256-bit iBFs for
the case of d = 16 and k evenly distributed between 4
and 7. The line shows the percentage of times that the
selected iBF actually yielded the best performance among
the candidates. Disregarding the scenarios with fewer
elements, the fpa-based selection criterion succeeded in
choosing the optimal candidate about 30% of the time.
Fig. 5(b) shows the percentile distribution of the
best-performing iBF after fpr testing. As expected, in
more filled iBF scenarios, setting fewer bits per element
is beneficial. However, the differences are relatively
small. As shown in Table 1, the observed fpr in the case
of k_const = 5 is practically equivalent (if not slightly
better) to the case where k is distributed. We can also
observe what the theory in § 2 predicts with regard to
smaller iBFs: (i) their inferior fpr performance for the
same m/n ratio, and (ii) their larger potential to benefit
from the power of choices.

Figure 4: Power of choice gains (m=256, k=5).
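
The fpa-based criterion with a distributed k can be illustrated with a short sketch. It assumes, as the text suggests, that the estimate after hashing is the fill factor raised to the candidate's own k; the hash derivation and the function names are hypothetical and only serve the example.

```python
import hashlib
import math


def k_opt(m: int, n: int) -> float:
    """Number of hash functions minimizing the classical fpr estimate."""
    return math.log(2) * m / n


def positions(element: str, candidate: int, k: int, m: int) -> list[int]:
    """Hypothetical eTag derivation: k positions for one element and candidate."""
    return [
        int.from_bytes(
            hashlib.sha256(f"{candidate}:{i}:{element}".encode()).digest()[:4], "big"
        ) % m
        for i in range(k)
    ]


def select_by_fpa(elements: list[str], ks: list[int], m: int = 256) -> tuple[float, int]:
    """Build one candidate per entry of ks; return (fpa, index) of the lowest rho**k."""
    best = None
    for c, k in enumerate(ks):
        bits = 0
        for element in elements:
            for p in positions(element, c, k, m):
                bits |= 1 << p
        rho = bin(bits).count("1") / m  # fill factor after construction
        fpa = rho ** k                  # false positive estimate after hashing
        if best is None or fpa < best[0]:
            best = (fpa, c)
    return best
```

For example, `select_by_fpa(names, [4, 4, 5, 5, 6, 6, 7, 7])` mirrors the d = 8 distribution mentioned above, and `k_opt(256, 32)` gives the value of k that the candidates are meant to bracket at 8 bits per element.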



2 We choose k = 5 to have a probabilistically sufficient footprint
space for the eTags ($m!/(m-k)! \approx 10^{12}$ with m = 256) when
targeting an m/n of about 8 bits per element.



5.1.3. Discussion

Based on our experimental evaluation, having more
than 32 candidates per element does not seem to bring
benefits in terms of performance. If the system design
choice is based on selection criteria optimized for the
non-presence of specific false positives (i.e.,
element-avoidance, § 4.1), increasing the number of
choices d allows complying with a larger set of false
positive avoidance policies. The practical limitation
appears to be solely how much space the application
designer is willing to spend on storing the candidate
element representations.

Figure 5: Distribution of iBF candidates for different number of hash
functions k (d=16, m=256).

5.2. Deletion 

We explore two important aspects of the deletable
regions extension. First, from a qualitative point of view,
we examine the actual capabilities to successfully delete
elements for different m/n ratios, numbers of regions r
and choices d. Second, we evaluate the quantitative
gains in terms of false positive reduction after element
bits are deleted. Obviously, both aspects are related and
intertwined with the ability to choose among candidate
iBF representations to favor the deletion capabilities.
Now, the application can choose the iBF candidate with
the largest number of bits set in collision-free zones,
thus increasing the bit deletability. Alternatively, one
may want to favor the element deletability, recalling that
removing a single element bit translates into a practical
deletion of the element.

Table 1: Observed fpr for iBFs with 16 eTag choices.

              Th. fpr   Std.     fpa-opt. (%)       fpr-opt. (%)
   m     n     (%)      (%)     k_cte   k_distr    k_cte   k_distr
 128     6     0.04     0.16    0.14     0.19      0.04     0.05
        12     0.75     1.12    0.88     0.86      0.37     0.32
        18     3.33     4.39    2.80     3.10      2.18     2.37
 256    12     0.04     0.09    0.08     0.08      0.01     0.03
        24     0.74     0.95    0.74     0.71      0.26     0.30
        36     3.31     3.63    2.69     2.75      2.07     2.15
 512    24     0.04     0.08    0.07     0.04      0.01     0.01
        48     0.74     0.83    0.64     0.64      0.22     0.25
        72     3.29     3.46    2.87     3.05      2.09     2.21

Using our basic coding scheme (§ 4.2), we consume
one bit per region to encode whether a collision happened,
in which case deletion is prohibited. Thus, the bits
available for iBF construction are reduced to
m' = m - log2(d) - r.

5.2.1. Quality 

We now evaluate how many of the inserted elements
can be safely removed in practice. Fig. 6(a) plots the
percentage of elements that can be deleted. As expected,
partitioning the iBF into more regions results in
a larger fraction of elements (bits) becoming deletable.
For instance, in the example of a 256-bit iBF with 32
regions (Fig. 6), when 24 elements are inserted, we
are able to delete on average more than 80% of the
elements by safely removing around 50% of the bits
(Fig. 6(b)). Playing with the candidate choices, we
can considerably enhance the bit (Fig. 7(a)) and element
(Fig. 7(b)) deletability. Finally, we were able to
validate experimentally the mathematical model of the
element deletability probability (Eq. 7).
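
The deletable regions mechanism evaluated here can be sketched as follows. The sketch assumes one collision flag per region and treats any attempt to set an already set bit as a collision (including two hashes of the same element); the hash derivation and the helper names are hypothetical, and the parameters follow m' = m - log2(d) - r with m = 256, d = 16 and r = 32.

```python
import hashlib
import math

M, D, R, K = 256, 16, 32, 5
M_PRIME = M - int(math.log2(D)) - R  # bits left for the filter itself


def positions(element: str, k: int, m_bits: int) -> list[int]:
    """Hypothetical hash derivation: k positions in an m_bits-wide filter."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{element}".encode()).digest()[:4], "big") % m_bits
        for i in range(k)
    ]


def region_of(p: int, m_bits: int, r: int) -> int:
    """Index of the region holding bit p, with ceil(m_bits / r) cells per region."""
    return p // math.ceil(m_bits / r)


def build(elements, m_bits=M_PRIME, r=R, k=K):
    """Return the bit list and the per-region collision flags."""
    bits = [0] * m_bits
    collided = [False] * r
    for element in elements:
        for p in positions(element, k, m_bits):
            if bits[p]:  # the cell was already set: mark its region as non-deletable
                collided[region_of(p, m_bits, r)] = True
            bits[p] = 1
    return bits, collided


def contains(bits, element, k=K):
    return all(bits[p] for p in positions(element, k, len(bits)))


def delete(bits, collided, element, k=K):
    """Clear the element's bits that lie in collision-free regions.

    Only meant for elements that were actually inserted. Returns True if at
    least one bit was cleared, i.e. the element no longer tests positive.
    """
    removed = False
    for p in positions(element, k, len(bits)):
        if not collided[region_of(p, len(bits), len(collided))]:
            bits[p] = 0
            removed = True
    return removed
```

Because a bit in a collision-free region was set by exactly one insertion, clearing it cannot affect any other inserted element, which is why this kind of deletion introduces no false negatives.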

5.2.2. Quantity 

Next, we explore the fpr gains due to bit deletability.
On the one hand, we have the potential gains of removing
bits from collision-free zones. On the other, there is
the cost of (1) coding the deletable regions (r bits), and
(2) having more filled iBFs due to the rarefaction of
colliding bits. While Fig. 8(a) shows the price of having
to code more regions, Fig. 8(b) illustrates the potential
gains once every deletable bit is removed. If we average
the fpr before and after elements are deleted, the iBF
performance appears equivalent to the fpr of a standard
non-deletable m-bit iBF. In comparison, a counting BF
with 2 bits per cell³ would behave like an iBF of size
m/2, which would have its element capacity accordingly
constrained.

Figure 6: Deletability as function of r (m=256).

Figure 7: Deletability as f(d). m=256, r=16.
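
To put the counting BF comparison in numbers, the sketch below evaluates the classical estimate fpr = (1 - (1 - 1/m)^{kn})^k for a 256-bit filter and for a 128-bit one holding the same 24 elements. The formula is the standard textbook expression, assumed here to be the one behind the theoretical values of Table 1; the function name is illustrative.

```python
def fpr_estimate(m: int, n: int, k: int) -> float:
    """Classical false positive estimate for an m-bit filter, n elements, k hashes."""
    return (1.0 - (1.0 - 1.0 / m) ** (k * n)) ** k


full = fpr_estimate(256, 24, 5)  # plain 256-bit iBF
half = fpr_estimate(128, 24, 5)  # 2-bit counters: only 128 cells fit in 256 bits
print(f"m=256: {100 * full:.2f}%   m=128: {100 * half:.2f}%")
```

Under this estimate, halving the number of cells at 24 elements raises the false positive rate by roughly an order of magnitude, which is the capacity penalty the text refers to.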

Analyzing the power of choices, Fig. 9 confirms the
intuition that choosing the best deletable iBF candidate
causes the colliding bits to "thin out" (greater ρ),
yielding a higher fpr before deletion (Fig. 9(a)) and a
smaller fpr after elements are removed (Fig. 9(b)).



5.2.3. Discussion 

There is a tussle between having a smaller fill factor
ρ, with more collisions at construction time reducing the
fpa, and the deletability extension, which benefits from
fewer collisions. Deletability may be a key property for
some system designs, for instance, whenever an element
in the iBF should be processed only once and then be
removed, or when space is needed to add new elements on
the fly. A more detailed evaluation requires taking the
specific application dynamics (e.g., frequency of
deletions/insertions) into consideration.

3 Using the power of choices, we could have with very high
probability a candidate that does not exceed the counter value of 3,
avoiding false negatives as long as no new additions are considered.

From an fpr performance perspective, the cost of coding
the deletable regions is only a slight increase in the
fpr, since r is only a fraction of m. However, this
practical bit space reduction, paid upfront, seems to
hinder, on average, the potential fpr gains due to bit
deletions. Nevertheless, the deletable regions extension
is a far more attractive approach to enable deletions in
space-constrained iBFs than alternative solutions based
on counting BFs. An open question is whether we are
able to find a more space-efficient coding scheme for
the deletable regions. Finally, the power of choices
again proved to be a very handy technique to deal with
the probabilistic nature of hash-based data structures,
enabling candidate selection for different criteria (e.g.,
fpr, element/bit deletability).

Figure 8: False positives as a function of r (m=256, d=16).

Figure 9: False positives as f(d). (m=256, r=16).

5.3. Security 

Besides fast computation, the main requirements for
the security extension are that (i) the random
distribution of the iBF bits is conserved, and (ii) given a
collection of packets I and the securely constructed iBFs,
one cannot easily reveal information about the inserted
elements (K). More generally, given a set of (I, iBF) pairs,
it must be at best very expensive to retrieve information
about the identities of K.

We first measured the randomness of the secure iBF
construction outputs from Algorithm 1 by fixing a set
of 20 elements and changing the per-packet 256-bit
randomly generated I value on each experiment run.
Table 2 gathers the average results of 100 experiments
with 1000 runs per experiment. The observed distribution
of outputs within an experiment, measured as the Hamming
distance between output bit vectors (BV), was very close
to the mean value of m/2 bits (128) with a small standard
deviation.⁴ The observed average number of bits set and
their distribution were comparable to standard iBF
constructs. Additionally, we analyzed whether the 20 most
frequent bit positions set in secure iBFs corresponded to
bits set in plain iBFs. We defined the correlation factor
as the fraction of matches and obtained a value of 0.371,
which is close to the probability of randomly guessing
bits in a 256-bit iBF with k = 4 and n = 20 elements
(Pr ≈ 96/256 ≈ 0.37).
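
As a point of reference for the Hamming distance figures, the distance between two independent, uniformly random m-bit vectors follows a Binomial(m, 1/2) distribution, with mean m/2 = 128 and standard deviation sqrt(m)/2 = 8 for m = 256. The following sketch reproduces that baseline empirically; the seed, the number of runs and the function names are arbitrary choices of the sketch and do not correspond to the authors' experiment.

```python
import random
import statistics


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def sample_distances(m: int = 256, runs: int = 1000, seed: int = 1) -> list[int]:
    """Hamming distances between pairs of independent random m-bit vectors."""
    rng = random.Random(seed)
    return [hamming(rng.getrandbits(m), rng.getrandbits(m)) for _ in range(runs)]


distances = sample_distances()
mean = statistics.mean(distances)
stdev = statistics.stdev(distances)
print(f"mean={mean:.2f} stdev={stdev:.2f}")
```

A construction whose outputs stay close to this baseline, as reported in Table 2, is behaving like a random bit vector with respect to this particular test.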

The results indicate that, assuming a random packet
identifier I, first, no actual patterns can be inferred from
the securely inserted elements, and second, the random
bit distribution of an iBF is conserved when using the
proposed algorithm. However, we recognize the limitations
of Alg. 1. For instance, if provable protection against
more elaborate attacks is required, then a more secure
and computationally expensive bit mixing procedure
(Step 1 in Alg. 1) should be considered, in addition
to a time-based shared secret as suggested in § 4.3.

4 In future work we will extend these results and the
hashing techniques evaluation of Sec. 5.4 with other
randomness tests such as those included in the Diehard suite
(http://www.stat.fsu.edu/pub/diehard).





                 Sec. iBF        Plain iBF    Random BV
Hamming dist.    127.94 (8.06)   -            127.95 (8.03)
# bits set       96.27 (3.20)    96.29 (-)    127.97 (7.97)
Correlation      0.371           -            -

Table 2: Evaluation of the secure iBF algorithm (m=256, k=4, n=20).
Avg. (Stdev) after 1000 runs.



5.4. Hashing technique 

Finally, we briefly investigate the impact of the hash
function implementation choice and the nature of the
input elements in small-size iBFs. Two factors determine
the "quality" of the bit distribution and consequently
may impact the observed fpr: (1) the input bit string,
and (2) the implementation of the hash function.

5.4.1. Input data sets 

Instead of considering elements as simple random bit
strings, we now explore three types of elements that
cover typical inputs of iBF applications:

• 32-bit IP addresses: Nearly 9M IP addresses
were generated by expanding the subnet values of
IP prefixes advertised in the CAIDA database.⁵
In addition, private IP addresses (10.0.0.0/16,
192.168.0.0/16) were also used in the experiments.

• 256-bit random labels: A set of 3M random labels
was generated by constructing each 256-bit label
from 64 randomly picked hex characters and checking
for uniqueness.

• Variable-bit dictionary words: A set formed by
98,568 entries of the American dictionary.⁶

5.4.2. Hash function choice 

We chose 3 commonly used cryptographic hash functions
(MD5, SHA1 and SHA256) and 2 general purpose
hash functions (CRC32 and BOB).⁷



The observed fpr values (Table 3) imply that, on average,
the input type does not affect the iBF performance.
Figure 10 plots the observed normalized sample variance
for different bit vector sizes (m). For lower m values the
variances show a larger difference, and they start
converging for m > 512. CRC presents the best output
distribution when dealing with IP addresses as inputs. This
may be explained by the 32-bit match of inputs and outputs.
In general, the functions exhibit similar behavior, leading
to the conclusion that all 5 hash functions can be used in
iBF scenarios independently of the nature of the elements.
This result experimentally confirms, also in the case of
small m values, the observation by Mitzenmacher and
Vadhan [21] that, given a certain degree of randomness in
the input, simple hash functions work well in practice.



 n   DoubleHash     IP              Random          Dict.
16   SHA1&MD5       0.340 (0.035)   0.338 (0.032)   0.328 (0.034)
     CRC32 segm.    0.345 (0.037)   0.349 (0.034)   0.338 (0.034)
32   SHA1&MD5       2.568 (0.436)   2.576 (0.449)   2.519 (0.385)
     CRC32 segm.    2.541 (0.418)   2.532 (0.403)   2.570 (0.444)

Table 3: Observed fpr in 256-bit iBF using double hashing with
SHA1 & MD5 and with 8-bit segments of CRC32. Avg. (StdDev);
1000 tests.



5.4.3. Hash segmentation technique 

We note that for the purposes of iBF construction,
hash output bits are wasted due to the mod m reduction.
Hence, we want to know whether we can divide the output
of a hash function into segments of log2(m) bits and use
each segment as an independent hash value. We compare
the bit distribution and fpr performance of iBFs
constructed using the double hashing technique with MD5
and SHA1 against iBFs generated with CRC32 segments as
h_i(x). The differences in the observed fpr (Table 3) are
negligible, which suggests that we may indeed use this
hashing technique in practice. As a practical consequence,
we can reduce the requirement of two independent hash
functions in the double hashing technique to a single hash
computation based on, e.g., CRC32 or BOB. This result can
be applied to iBF networking applications with on-line
element hashing instead of pre-computed element names.
Moreover, the hash segmentation technique may be useful
in other multiple-hashing-based data structures (e.g.,
d-left hash tables) that require hashing on a per-packet
basis.
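
A minimal sketch of the segmentation idea, assuming m is a power of two and that the first two CRC32 segments play the roles of h1 and h2 in the double hashing construction g_i(x) = h1(x) + i * h2(x) mod m. Forcing h2 to be odd is a choice of this sketch (it keeps the k positions distinct when m is a power of two) and is not taken from the evaluation above; the function names are illustrative.

```python
import zlib


def crc32_segments(data: bytes, m: int = 256) -> list[int]:
    """Split the 32-bit CRC of `data` into segments of log2(m) bits each."""
    seg_bits = m.bit_length() - 1           # log2(m) for a power-of-two m
    crc = zlib.crc32(data) & 0xFFFFFFFF
    mask = (1 << seg_bits) - 1
    return [(crc >> shift) & mask for shift in range(0, 32 - seg_bits + 1, seg_bits)]


def positions(data: bytes, k: int, m: int = 256) -> list[int]:
    """Double hashing with h1 and h2 taken from a single CRC32 computation."""
    h1, h2 = crc32_segments(data, m)[:2]
    h2 |= 1  # assumption of this sketch: an odd step keeps the k positions distinct
    return [(h1 + i * h2) % m for i in range(k)]
```

With m = 256 a single CRC32 yields four 8-bit segments, so one hash computation per element is enough to derive all k positions, for example `positions(b"10.0.0.1", 5)`.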



5 ftp.ripe.net/ripe/stats/delegated-ripencc-20090308
6 /usr/share/dict/american-english
7 Related work has investigated the properties of 25 popular hash
functions, pointing to BOB as a fast alternative that yields excellent
randomized outputs for network applications [20]. Although MD5
and SHA1 are considered broken due to the recent discovery of
collisions, they are perfectly valid for our randomness purposes.



6. Relevance and related work 

Although multiple variants of Bloom filter designs
and applications have been proposed in recent years
(e.g., Bloomier, dynamic, spectral, adaptive, retouched, 
etc.), to the best of our knowledge, none of the previ- 
ous work focuses on the particular requirements of dis- 
tributed networking applications using small Bloom fil- 
ters in packet headers. 

Prior work on improved Bloom filters includes the
Power of Two Choices filter [14] and Partitioned
Hashing [22], which rely on the power of choices at
hashing time to improve the performance of BFs. False
positives are reduced in [22] by a careful choice of the
group of hash functions that are well-matched to the
input elements. However, this scheme is not practical in
distributed, highly dynamic environments. The main
idea of [14] is to reduce the number of 1s by choosing
the "best" set of hash functions. Besides our
in-packet-header scope, our approach differs in that we
include the information of which group of hash functions
was used (d value) in the packet itself, thereby avoiding
the caveat of checking multiple sets. On the other hand,
we need to stick to one set of hash functions for all
elements in the BF, whereas in [14] the optimal group of
hash functions can be chosen on an element basis. To our
benefit, due to the reduced bit vector scenario, we are
able to select an optimal BF after evaluating all d
candidates, which leads to improved performance even in
very dense BF settings (small m/n ratios).

Regarding the notion of choosing the best candidate
filter, the Best-of-N method [15] only considers a
standalone application where the best BF selection is based
on the least dense filter constructed with the optimal
number k_opt of hash functions. Our distributed iBF
applications consider candidates with different numbers
of bits set (k_distr), as the maximum set cardinality may
be unknown and hashing at packet processing time may
not be an option. Moreover, we note that in distributed
iBF applications (1) it may be possible to test upfront
for the presence of elements to be queried and select the
candidate with the best observed fpr, and (2) the
selection criteria may go beyond reducing the fpr, for
instance favoring the deletion of elements or avoiding
specific false positives.

As far as we know, our scheme for deleting items
from a BF based on coding the regions where collisions
have happened has not been proposed before. Due to
its space efficiency, the false-negative-free characteris- 
tics, and the high probability of successful deletions, the 
Deletable Regions extension may have interesting appli- 
cations beyond the scope of iBFs. The closest BF design 
innovation to support deletions, other than by counting 
BFs or d-left fingerprint hash tables, are the Variable- 
length Signatures (VBF) by Lu et al Q. While both 
element deletion schemes are based on resetting at least
one bit from an element signature, our scheme introduces
no false negatives at the cost of providing only
probabilistic guarantees for element deletion.

Security and privacy preserving extensions for BFs
have been previously studied in [18, 24]; our main
novelty resides in taking the specifics of distributed
systems and data packets into consideration (e.g., flow
identifier, time-based loose synchronization of
distributed secrets).

Recently, our work on iBFs has contributed to enabling
a novel packet forwarding scheme with built-in DDoS
protection [19]. The notion of element Tags has its roots
in our work on the link identifier based forwarding
fabric in [8], rendering the system more useful (network
policy compliance, loop avoidance, security) and
efficient (fpr control, larger multicast groups). Similarly,
the d-eTag extensions may be applied to the case of IP
multicast |01 to reduce false positives and to improve
compliance with inter-domain AS policies in the case of
false positives. When applied to the credentials-based
architecture proposed in [5], multiple candidates may
allow iBFs to traverse longer paths before reaching the
maximum density. Moreover, the security extension may
provide extra protection from an en-route attacker
spoofing the source IP address and re-using the flow
credentials for unauthorized traffic. Finally, the hash
segmentation technique appears useful to lower the burden
on networking elements when computing multiple hashes
on a per-packet basis.

7. Conclusions 

This paper explores an exciting front in the Bloom fil- 
ter research space, namely the special category of small 
Bloom filters carried in packet headers. Using iBFs is a 
promising approach for networking application design- 
ers choosing to move application state to the packets 
themselves. At the expense of some false positives,
fixed-size iBFs are amenable to hardware and pave
the way for new networking applications.

We studied the design space of iBFs in depth and
evaluated new ways to enrich iBF-based networking
applications without sacrificing the Bloom filter
simplicity. First, the power of choices extension proves
to be a very powerful and handy technique to deal with the
probabilistic nature of hash-based data structures,
providing finer control over false positives and enabling
compliance with system policies and design optimization
goals. Second, the space-efficient element deletion
technique provides an important (probabilistic) capability
without the overhead of existing solutions like counting
Bloom filters, while avoiding the limitations of
false-negative-prone BF extensions. Third, security
extensions were considered to couple iBFs to time and
packet contents, providing a method to secure iBFs against
tampering and replay attacks. Finally, we have validated
the extensions in a rich simulation set-up, including
useful recommendations for efficient hashing
implementations. We hope that this paper motivates the
design of more iBF extensions and new networking
applications.
Appendix A. Mathematical Model for Element
Deletability Probability

Consider a vector of size m, k hash functions and r
regions. The bit vector size to construct the actual BF is
m' = m - r and the number of bit cells in each region is
equal to ⌈m'/r⌉. The probability of two hash functions
setting a given bit cell is:

$p_1 = \frac{1}{m'} \cdot \frac{1}{m'} = \left(\frac{1}{m'}\right)^2$    (A.1)

If we use one hash function per element (i.e. one bit)
and insert n elements, the probability of having at least
one collision in a given cell is:

Considering that each element insertion sets k bits to 1,
we compute the probability to:

$p_3$

Equation A.3 considers a collision in a given bit cell.
Now, we extend the probability to the m'/r bit cells in a
single region:

$p_4 = (1 - p_3)^{m'/r}$

Replacing m' with m - r, the probability of an element
being deletable, that is, of having at least one bit set
in a collision-free region, is:

Finally, neglecting the contributions of the terms for i >
2 as they tend to zero, we get:

Appendix B. Mathematical model for d-candidate
fpa optimization (adapted from [15])

Given the iBF parameters m, n, k, and letting d be the
number of different iBF candidates for the same element
set, the probability of setting s bits in an iBF candidate
can be formulated as an independent random variable
experiment:

E[s] = m

Defining μ = E[s] and σ = σ[s], the minimum continuous
probability density function is:

Consequently, the expectation of the least number of
bits (s_min) set by any of d candidates:

Finally, the probability of a false positive once the
smallest fill ratio has been estimated:

Pr[false positive]    (B.5)